{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### This notebook demonstrates the use of the Reject Option Classification (ROC) post-processing algorithm for bias mitigation.\n",
    "- The debiasing function used is implemented in the `RejectOptionClassification` class.\n",
    "- Divide the dataset into training, validation, and testing partitions.\n",
    "- Train classifier on original training data.\n",
    "- Estimate the optimal classification threshold, that maximizes balanced accuracy without fairness constraints.\n",
    "- Estimate the optimal classification threshold, and the critical region boundary (ROC margin) using a validation set for the desired constraint on fairness. The best parameters are those that maximize the classification threshold while satisfying the fairness constraints.\n",
    "- The constraints can be used on the following fairness measures:\n",
    "    * Statistical parity difference on the predictions of the classifier\n",
    "    * Average odds difference for the classifier\n",
    "    * Equal opportunity difference for the classifier\n",
    "- Determine the prediction scores for testing data. Using the estimated optimal classification threshold, compute accuracy and fairness metrics.\n",
    "- Using the determined optimal classification threshold and the ROC margin, adjust the predictions. Report accuracy and fairness metric on the new predictions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "# Load all necessary packages\n",
    "import sys\n",
    "sys.path.append(\"../\")\n",
    "import numpy as np\n",
    "from tqdm import tqdm\n",
    "from warnings import warn\n",
    "\n",
    "from aif360.datasets import BinaryLabelDataset\n",
    "from aif360.datasets import AdultDataset, GermanDataset, CompasDataset\n",
    "from aif360.metrics import ClassificationMetric, BinaryLabelDatasetMetric\n",
    "from aif360.metrics.utils import compute_boolean_conditioning_vector\n",
    "from aif360.algorithms.preprocessing.optim_preproc_helpers.data_preproc_functions\\\n",
    "        import load_preproc_data_adult, load_preproc_data_german, load_preproc_data_compas\n",
    "from aif360.algorithms.postprocessing.reject_option_classification\\\n",
    "        import RejectOptionClassification\n",
    "from common_utils import compute_metrics\n",
    "\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "from sklearn.metrics import accuracy_score\n",
    "\n",
    "from IPython.display import Markdown, display\n",
    "import matplotlib.pyplot as plt\n",
    "from ipywidgets import interactive, FloatSlider"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Load dataset and specify options"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "## import dataset\n",
    "dataset_used = \"adult\" # \"adult\", \"german\", \"compas\"\n",
    "protected_attribute_used = 1 # 1, 2\n",
    "\n",
    "if dataset_used == \"adult\":\n",
    "#     dataset_orig = AdultDataset()\n",
    "    if protected_attribute_used == 1:\n",
    "        privileged_groups = [{'sex': 1}]\n",
    "        unprivileged_groups = [{'sex': 0}]\n",
    "        dataset_orig = load_preproc_data_adult(['sex'])\n",
    "    else:\n",
    "        privileged_groups = [{'race': 1}]\n",
    "        unprivileged_groups = [{'race': 0}]\n",
    "        dataset_orig = load_preproc_data_adult(['race'])\n",
    "    \n",
    "elif dataset_used == \"german\":\n",
    "#     dataset_orig = GermanDataset()\n",
    "    if protected_attribute_used == 1:\n",
    "        privileged_groups = [{'sex': 1}]\n",
    "        unprivileged_groups = [{'sex': 0}]\n",
    "        dataset_orig = load_preproc_data_german(['sex'])\n",
    "    else:\n",
    "        privileged_groups = [{'age': 1}]\n",
    "        unprivileged_groups = [{'age': 0}]\n",
    "        dataset_orig = load_preproc_data_german(['age'])\n",
    "    \n",
    "elif dataset_used == \"compas\":\n",
    "#     dataset_orig = CompasDataset()\n",
    "    if protected_attribute_used == 1:\n",
    "        privileged_groups = [{'sex': 1}]\n",
    "        unprivileged_groups = [{'sex': 0}]\n",
    "        dataset_orig = load_preproc_data_compas(['sex'])\n",
    "    else:\n",
    "        privileged_groups = [{'race': 1}]\n",
    "        unprivileged_groups = [{'race': 0}]  \n",
    "        dataset_orig = load_preproc_data_compas(['race'])\n",
    "\n",
    "        \n",
    "# Metric used (should be one of allowed_metrics)\n",
    "metric_name = \"Statistical parity difference\"\n",
    "\n",
    "# Upper and lower bound on the fairness metric used\n",
    "metric_ub = 0.05\n",
    "metric_lb = -0.05\n",
    "        \n",
    "#random seed for calibrated equal odds prediction\n",
    "np.random.seed(1)\n",
    "\n",
    "# Verify metric name\n",
    "allowed_metrics = [\"Statistical parity difference\",\n",
    "                   \"Average odds difference\",\n",
    "                   \"Equal opportunity difference\"]\n",
    "if metric_name not in allowed_metrics:\n",
    "    raise ValueError(\"Metric name should be one of allowed metrics\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Split into train, test and validation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get the dataset and split into train and test\n",
    "dataset_orig_train, dataset_orig_vt = dataset_orig.split([0.7], shuffle=True)\n",
    "dataset_orig_valid, dataset_orig_test = dataset_orig_vt.split([0.5], shuffle=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Clean up training data and display properties of the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "#### Training Dataset shape"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(34189, 18)\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "#### Favorable and unfavorable labels"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(1.0, 0.0)\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "#### Protected attribute names"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['sex']\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "#### Privileged and unprivileged protected attribute values"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "([array([1.])], [array([0.])])\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "#### Dataset feature names"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['race', 'sex', 'Age (decade)=10', 'Age (decade)=20', 'Age (decade)=30', 'Age (decade)=40', 'Age (decade)=50', 'Age (decade)=60', 'Age (decade)=>=70', 'Education Years=6', 'Education Years=7', 'Education Years=8', 'Education Years=9', 'Education Years=10', 'Education Years=11', 'Education Years=12', 'Education Years=<6', 'Education Years=>12']\n"
     ]
    }
   ],
   "source": [
    "# print out some labels, names, etc.\n",
    "display(Markdown(\"#### Training Dataset shape\"))\n",
    "print(dataset_orig_train.features.shape)\n",
    "display(Markdown(\"#### Favorable and unfavorable labels\"))\n",
    "print(dataset_orig_train.favorable_label, dataset_orig_train.unfavorable_label)\n",
    "display(Markdown(\"#### Protected attribute names\"))\n",
    "print(dataset_orig_train.protected_attribute_names)\n",
    "display(Markdown(\"#### Privileged and unprivileged protected attribute values\"))\n",
    "print(dataset_orig_train.privileged_protected_attributes, \n",
    "      dataset_orig_train.unprivileged_protected_attributes)\n",
    "display(Markdown(\"#### Dataset feature names\"))\n",
    "print(dataset_orig_train.feature_names)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Metric for original training data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "#### Original training dataset"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Difference in mean outcomes between unprivileged and privileged groups = -0.190698\n"
     ]
    }
   ],
   "source": [
    "metric_orig_train = BinaryLabelDatasetMetric(dataset_orig_train, \n",
    "                                             unprivileged_groups=unprivileged_groups,\n",
    "                                             privileged_groups=privileged_groups)\n",
    "display(Markdown(\"#### Original training dataset\"))\n",
    "print(\"Difference in mean outcomes between unprivileged and privileged groups = %f\" % metric_orig_train.mean_difference())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Train classifier on original data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Logistic regression classifier and predictions\n",
    "scale_orig = StandardScaler()\n",
    "X_train = scale_orig.fit_transform(dataset_orig_train.features)\n",
    "y_train = dataset_orig_train.labels.ravel()\n",
    "\n",
    "lmod = LogisticRegression()\n",
    "lmod.fit(X_train, y_train)\n",
    "y_train_pred = lmod.predict(X_train)\n",
    "\n",
    "# positive class index\n",
    "pos_ind = np.where(lmod.classes_ == dataset_orig_train.favorable_label)[0][0]\n",
    "\n",
    "dataset_orig_train_pred = dataset_orig_train.copy(deepcopy=True)\n",
    "dataset_orig_train_pred.labels = y_train_pred"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Obtain scores for validation and test sets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset_orig_valid_pred = dataset_orig_valid.copy(deepcopy=True)\n",
    "X_valid = scale_orig.transform(dataset_orig_valid_pred.features)\n",
    "y_valid = dataset_orig_valid_pred.labels\n",
    "dataset_orig_valid_pred.scores = lmod.predict_proba(X_valid)[:,pos_ind].reshape(-1,1)\n",
    "\n",
    "dataset_orig_test_pred = dataset_orig_test.copy(deepcopy=True)\n",
    "X_test = scale_orig.transform(dataset_orig_test_pred.features)\n",
    "y_test = dataset_orig_test_pred.labels\n",
    "dataset_orig_test_pred.scores = lmod.predict_proba(X_test)[:,pos_ind].reshape(-1,1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Find the optimal parameters from the validation set"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Best threshold for classification only (no fairness)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Best balanced accuracy (no fairness constraints) = 0.7473\n",
      "Optimal classification threshold (no fairness constraints) = 0.2674\n"
     ]
    }
   ],
   "source": [
    "num_thresh = 100\n",
    "ba_arr = np.zeros(num_thresh)\n",
    "class_thresh_arr = np.linspace(0.01, 0.99, num_thresh)\n",
    "for idx, class_thresh in enumerate(class_thresh_arr):\n",
    "    \n",
    "    fav_inds = dataset_orig_valid_pred.scores > class_thresh\n",
    "    dataset_orig_valid_pred.labels[fav_inds] = dataset_orig_valid_pred.favorable_label\n",
    "    dataset_orig_valid_pred.labels[~fav_inds] = dataset_orig_valid_pred.unfavorable_label\n",
    "    \n",
    "    classified_metric_orig_valid = ClassificationMetric(dataset_orig_valid,\n",
    "                                             dataset_orig_valid_pred, \n",
    "                                             unprivileged_groups=unprivileged_groups,\n",
    "                                             privileged_groups=privileged_groups)\n",
    "    \n",
    "    ba_arr[idx] = 0.5*(classified_metric_orig_valid.true_positive_rate()\\\n",
    "                       +classified_metric_orig_valid.true_negative_rate())\n",
    "\n",
    "best_ind = np.where(ba_arr == np.max(ba_arr))[0][0]\n",
    "best_class_thresh = class_thresh_arr[best_ind]\n",
    "\n",
    "print(\"Best balanced accuracy (no fairness constraints) = %.4f\" % np.max(ba_arr))\n",
    "print(\"Optimal classification threshold (no fairness constraints) = %.4f\" % best_class_thresh)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Estimate optimal parameters for the ROC method"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "ROC = RejectOptionClassification(unprivileged_groups=unprivileged_groups, \n",
    "                                 privileged_groups=privileged_groups, \n",
    "                                 low_class_thresh=0.01, high_class_thresh=0.99,\n",
    "                                  num_class_thresh=100, num_ROC_margin=50,\n",
    "                                  metric_name=metric_name,\n",
    "                                  metric_ub=metric_ub, metric_lb=metric_lb)\n",
    "ROC = ROC.fit(dataset_orig_valid, dataset_orig_valid_pred)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Optimal classification threshold (with fairness constraints) = 0.5049\n",
      "Optimal ROC margin = 0.1819\n"
     ]
    }
   ],
   "source": [
    "print(\"Optimal classification threshold (with fairness constraints) = %.4f\" % ROC.classification_threshold)\n",
    "print(\"Optimal ROC margin = %.4f\" % ROC.ROC_margin)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Predictions from Validation Set"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "#### Validation set"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "##### Raw predictions - No fairness constraints, only maximizing balanced accuracy"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Balanced accuracy = 0.7473\n",
      "Statistical parity difference = -0.3703\n",
      "Disparate impact = 0.2687\n",
      "Average odds difference = -0.2910\n",
      "Equal opportunity difference = -0.3066\n",
      "Theil index = 0.1123\n"
     ]
    }
   ],
   "source": [
    "# Metrics for the test set\n",
    "fav_inds = dataset_orig_valid_pred.scores > best_class_thresh\n",
    "dataset_orig_valid_pred.labels[fav_inds] = dataset_orig_valid_pred.favorable_label\n",
    "dataset_orig_valid_pred.labels[~fav_inds] = dataset_orig_valid_pred.unfavorable_label\n",
    "\n",
    "display(Markdown(\"#### Validation set\"))\n",
    "display(Markdown(\"##### Raw predictions - No fairness constraints, only maximizing balanced accuracy\"))\n",
    "\n",
    "metric_valid_bef = compute_metrics(dataset_orig_valid, dataset_orig_valid_pred, \n",
    "                unprivileged_groups, privileged_groups)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "#### Validation set"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "##### Transformed predictions - With fairness constraints"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Balanced accuracy = 0.6051\n",
      "Statistical parity difference = -0.0436\n",
      "Disparate impact = 0.6107\n",
      "Average odds difference = -0.0049\n",
      "Equal opportunity difference = -0.0136\n",
      "Theil index = 0.2184\n"
     ]
    }
   ],
   "source": [
    "# Transform the validation set\n",
    "dataset_transf_valid_pred = ROC.predict(dataset_orig_valid_pred)\n",
    "\n",
    "display(Markdown(\"#### Validation set\"))\n",
    "display(Markdown(\"##### Transformed predictions - With fairness constraints\"))\n",
    "metric_valid_aft = compute_metrics(dataset_orig_valid, dataset_transf_valid_pred, \n",
    "                unprivileged_groups, privileged_groups)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Testing: Check if the metric optimized has not become worse\n",
    "assert np.abs(metric_valid_aft[metric_name]) <= np.abs(metric_valid_bef[metric_name])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Predictions from Test Set"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "#### Test set"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "##### Raw predictions - No fairness constraints, only maximizing balanced accuracy"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Balanced accuracy = 0.7417\n",
      "Statistical parity difference = -0.3576\n",
      "Disparate impact = 0.2774\n",
      "Average odds difference = -0.3281\n",
      "Equal opportunity difference = -0.4001\n",
      "Theil index = 0.1128\n"
     ]
    }
   ],
   "source": [
    "# Metrics for the test set\n",
    "fav_inds = dataset_orig_test_pred.scores > best_class_thresh\n",
    "dataset_orig_test_pred.labels[fav_inds] = dataset_orig_test_pred.favorable_label\n",
    "dataset_orig_test_pred.labels[~fav_inds] = dataset_orig_test_pred.unfavorable_label\n",
    "\n",
    "display(Markdown(\"#### Test set\"))\n",
    "display(Markdown(\"##### Raw predictions - No fairness constraints, only maximizing balanced accuracy\"))\n",
    "\n",
    "metric_test_bef = compute_metrics(dataset_orig_test, dataset_orig_test_pred, \n",
    "                unprivileged_groups, privileged_groups)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "#### Test set"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "##### Transformed predictions - With fairness constraints"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Balanced accuracy = 0.5968\n",
      "Statistical parity difference = -0.0340\n",
      "Disparate impact = 0.6932\n",
      "Average odds difference = -0.0151\n",
      "Equal opportunity difference = -0.0415\n",
      "Theil index = 0.2133\n"
     ]
    }
   ],
   "source": [
    "# Metrics for the transformed test set\n",
    "dataset_transf_test_pred = ROC.predict(dataset_orig_test_pred)\n",
    "\n",
    "display(Markdown(\"#### Test set\"))\n",
    "display(Markdown(\"##### Transformed predictions - With fairness constraints\"))\n",
    "metric_test_aft = compute_metrics(dataset_orig_test, dataset_transf_test_pred, \n",
    "                unprivileged_groups, privileged_groups)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Summary of Optimal Parameters\n",
    "We show the optimal parameters for all combinations of metrics optimized, datasets, and protected attributes below."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Fairness Metric: Statistical parity difference, Accuracy Metric: Balanced accuracy\n",
    "\n",
    "#### Performance\n",
    "\n",
    "| Dataset |Sex (Acc-Bef)|Sex (Acc-Aft)|Sex (Fair-Bef)|Sex (Fair-Aft)|Race/Age (Acc-Bef)|Race/Age (Acc-Aft)|Race/Age (Fair-Bef)|Race/Age (Fair-Aft)|\n",
    "|-|-|-|-|-|-|-|-|-|\n",
    "|Adult (Valid)|0.7473|0.6051|-0.3703|-0.0436|0.7473|0.6198|-0.2226|-0.0007|\n",
    "|Adult (Test)|0.7417|0.5968|-0.3576|-0.0340|0.7417|0.6202|-0.2279|0.0006|\n",
    "|German (Valid)|0.6930|0.6991|-0.0613|0.0429|0.6930|0.6607|-0.2525|-0.0328|\n",
    "|German (Test)|0.6524|0.6460|-0.0025|0.0410|0.6524|0.6317|-0.3231|-0.1038|\n",
    "|Compas (Valid)|0.6599|0.6400|-0.2802|0.0234|0.6599|0.6646|-0.3225|-0.0471|\n",
    "|Compas (Test)|0.6774|0.6746|-0.2724|-0.0313|0.6774|0.6512|-0.2494|0.0578|\n",
    "\n",
    "#### Optimal Parameters\n",
    "\n",
    "| Dataset |Sex (Class. thresh.)|Sex (Class. thresh. - fairness)|Sex (ROC margin - fairness)| Race/Age (Class. thresh.)|Race/Age (Class. thresh. - fairness)|Race/Age (ROC margin - fairness)|\n",
    "|-|-|-|-|-|-|-|\n",
    "|Adult|0.2674|0.5049|0.1819|0.2674|0.5049|0.0808|\n",
    "|German|0.6732|0.6237|0.0538|0.6732|0.7029|0.0728|\n",
    "|Compas|0.5148|0.5841|0.0679|0.5148|0.5841|0.0679|"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Fairness Metric: Average odds difference, Accuracy Metric: Balanced accuracy\n",
    "\n",
    "#### Performance\n",
    "\n",
    "| Dataset |Sex (Acc-Bef)|Sex (Acc-Aft)|Sex (Fair-Bef)|Sex (Fair-Aft)|Race/Age (Acc-Bef)|Race/Age (Acc-Aft)|Race/Age (Fair-Bef)|Race/Age (Fair-Aft)|\n",
    "|-|-|-|-|-|-|-|-|-|\n",
    "|Adult (Valid)|0.7473|0.6058|-0.2910|-0.0385|0.7473|0.6593|-0.1947|-0.0444|\n",
    "|Adult (Test)|0.7417|0.6024|-0.3281|-0.0438|0.7417|0.6611|-0.1991|-0.0121|\n",
    "|German (Valid)|0.6930|0.6930|-0.0039|-0.0039|0.6930|0.6807|-0.0919|-0.0193|\n",
    "|German (Test)|0.6524|0.6571|0.0071|0.0237|0.6524|0.6587|-0.3278|-0.2708|\n",
    "|Compas (Valid)|0.6599|0.6416|-0.2285|-0.0332|0.6599|0.6646|-0.2918|-0.0105|\n",
    "|Compas (Test)|0.6774|0.6721|-0.2439|-0.0716|0.6774|0.6512|-0.1927|0.1145|\n",
    "\n",
    "#### Optimal Parameters\n",
    "\n",
    "| Dataset |Sex (Class. thresh.)|Sex (Class. thresh. - fairness)|Sex (ROC margin - fairness)| Race/Age (Class. thresh.)|Race/Age (Class. thresh. - fairness)|Race/Age (ROC margin - fairness)|\n",
    "|-|-|-|-|-|-|-|\n",
    "|Adult|0.2674|0.5049|0.1212|0.2674|0.5049|0.0505|\n",
    "|German|0.6732|0.6633|0.0137|0.6732|0.6732|0.0467|\n",
    "|Compas|0.5148|0.5742|0.0608|0.5148|0.5841|0.0679|\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Fairness Metric: Equal opportunity difference, Accuracy Metric: Balanced accuracy\n",
    "\n",
    "#### Performance\n",
    "\n",
    "| Dataset |Sex (Acc-Bef)|Sex (Acc-Aft)|Sex (Fair-Bef)|Sex (Fair-Aft)|Race/Age (Acc-Bef)|Race/Age (Acc-Aft)|Race/Age (Fair-Bef)|Race/Age (Fair-Aft)|\n",
    "|-|-|-|-|-|-|-|-|-|\n",
    "|Adult (Valid)|0.7473|0.6051|-0.3066|-0.0136|0.7473|0.6198|-0.2285|0.0287|\n",
    "|Adult (Test)|0.7417|0.5968|-0.4001|-0.0415|0.7417|0.6202|-0.2165|0.1193|\n",
    "|German (Valid)|0.6930|0.6930|-0.0347|-0.0347|0.6930|0.6597|0.1162|-0.0210|\n",
    "|German (Test)|0.6524|0.6571|0.0400|0.0733|0.6524|0.6190|-0.3556|-0.4333|\n",
    "|Compas (Valid)|0.6599|0.6416|-0.1938|0.0244|0.6599|0.6646|-0.2315|0.0002|\n",
    "|Compas (Test)|0.6774|0.6721|-0.1392|0.0236|0.6774|0.6512|-0.1877|0.1196|\n",
    "\n",
    "#### Optimal Parameters\n",
    "\n",
    "| Dataset |Sex (Class. thresh.)|Sex (Class. thresh. - fairness)|Sex (ROC margin - fairness)| Race/Age (Class. thresh.)|Race/Age (Class. thresh. - fairness)|Race/Age (ROC margin - fairness)|\n",
    "|-|-|-|-|-|-|-|\n",
    "|Adult|0.2674|0.5049|0.1819|0.2674|0.5049|0.0808|\n",
    "|German|0.6732|0.6633|0.0137|0.6732|0.6039|0.0000|\n",
    "|Compas|0.5148|0.5742|0.0608|0.5148|0.5841|0.0679|\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}